INTRODUCTION

To fully analyze high-energy data, one would dearly love to measure the
distributions of final-state quarks and gluons. However, owing to the
confinement of colour charge, these are not the final-state particles of the
reaction: colourless hadrons are (see footnote 1). This means that we should
instead discuss event properties that satisfy the following conditions:

• Well-defined and easy to measure from the hadronic final-state. 

• Well-defined and easy to calculate order-by-order in perturbation theory 
from the partonic final-state. 

• Have a close correspondence with the distributions of the final-state 
quarks and gluons that we are really interested in. 

These event properties are generally called jets. 

It should be immediately apparent that there will be many event properties 
that satisfy these conditions and hence many possible ways of defining jets. 
Although there are certainly some that are better than others, in the sense 
that they are more reliably measurable or calculable, we take the view for 
now that all definitions are created equal. There is of course an important 
corollary to this — since the definition is not unique, it is not surprising to 
find that results depend on it: What You See Is What You Look For, or 
WYSIWYLF (0). 

The last of these three conditions is the cause of a great deal of confusion,
because it has the direct practical consequence that in leading order
perturbative calculations there is a one-to-one correspondence between partons
and jets. This can make it very easy to make the mistake of thinking that we
are actually measuring a primary parton, instead of merely an event property
that is largely determined by a primary parton. It also means that all jet
definitions give equal results at leading order and it is only at higher orders
that one can calculate the dependence on the definition, which is certainly
seen in the data. This underlines the importance of making jet calculations
beyond leading order, either exactly in perturbation theory as described in
Walter Giele's talk (||), or using parton shower methods as I describe below.

1 To avoid confusion, I should point out at this stage that I use the word "hadron"
rather loosely to mean any particle produced by the hadronization process, which
also includes soft photons and leptons coming from secondary hadron decays.

A typical analysis proceeds according to the general pattern shown in Fig. 1.
In QCD studies, the individual processes that take us from left to right are 
interesting in their own right, while for other studies the really crucial question 
is how well these processes are modeled so that parton-level predictions can 
be compared with detector-level measurements. 

There are several important issues that need to be addressed when making 
such analyses: 

• Which of the four things labeled "jets" should we be aiming to measure 
and calculate? 

• How well do we understand this loop? 

• How can we improve our understanding and predictions? 

The first of these is a question of demarcation (should theorists be correcting
their predictions to hadron level, or should experimenters be correcting their
data to parton level?) and is largely a question of personal taste. The other two
questions are the main themes of this talk.

I will begin by describing the current norm in jet definitions: the cone
algorithm. In fact I should say cone algorithms, since everyone seems to have
their own slightly different preferred version. Then I will describe a recently
proposed family of cluster-based 'k⊥' jet algorithms, which are descended from
those used in e+e− annihilation, and which have a variety of advantages over
cone-based algorithms.

Besides the exact matrix-element calculations described in Walter's talk, the
main method used to calculate jet properties is the parton shower approach,
either implemented as Monte Carlo simulations, or explicitly as analytical
calculations. I will describe the common theoretical basis of these, as well as
a few more technical points associated with the Monte Carlo algorithms. I
will also highlight an important area in which improvements can be expected
in the near future.

FIG. 1. Schematic diagram of a typical analysis using jets.

Hadronization corrections are not understood at a fundamental level at 
present and are generally estimated using Monte Carlo programs. However, 
since they do not play too big a role in most collider jet studies I do not spend 
too much time describing them. 

JET DEFINITIONS 

The first two requirements of a jet definition, that the jets be easy to mea- 
sure and calculate reliably, turn out to be very similar. This has meant that 
experimental and theoretical improvements have tended to go hand in hand, 
with modifications proposed on theoretical grounds tending to result in im- 
provements in the experimental properties of the algorithm and vice versa. 
This is because many of the problems are in fact the same. 

In perturbative QCD there is a collinear divergence when any two massless 
partons are parallel. In the total cross-section, this divergence is guaranteed 
to be canceled by a contribution from the virtual correction to the equivalent 
process, with the two partons replaced by their sum. However, for this cance- 
lation to also take place in the jet calculation, it is necessary to ensure that a 
collinear pair of particles are treated identically to a single particle with their 
combined momentum. This means that algorithms that use information such 
as which particle in the event had the highest energy cannot be calculated 
perturbatively, since the resulting jet properties could be altered by replacing 
the hardest particle in the event by two or more collinear particles none of 
which is any longer the hardest one in the event. From the experimental point 
of view, the equivalent problem is the fact that parallel particles go into the 
same calorimeter cell and can never be resolved. Thus any algorithm that 
depended critically on resolving a pair of almost collinear particles would give 
results that depended strongly on the angular resolution of the calorimeter. 
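
The collinear-safety argument above can be illustrated with a minimal sketch. It is a toy, not part of any jet algorithm: the function names and the three-particle "event" (energies of particles travelling in one common direction) are assumptions made for illustration. It shows that an observable built on "the hardest particle" changes when one particle is replaced by two exactly parallel ones, while a summed energy does not.

```python
def hardest_particle_energy(energies):
    """Collinear-UNSAFE quantity: energy of the single most energetic particle."""
    return max(energies)


def total_energy(energies):
    """Collinear-safe quantity: summed energy of all the particles."""
    return sum(energies)


def collinear_split(energies, index, fraction):
    """Replace particle `index` by two exactly parallel particles that carry
    `fraction` and `1 - fraction` of its energy (hypothetical helper)."""
    e = energies[index]
    return energies[:index] + [fraction * e, (1.0 - fraction) * e] + energies[index + 1:]


# Illustrative numbers only (GeV): the 50 GeV particle splits into 25 + 25.
event = [50.0, 40.0, 10.0]
split_event = collinear_split(event, 0, 0.5)
```

After the split the formerly second-hardest particle becomes the hardest one, so anything seeded on the hardest particle is altered, whereas the summed energy is unchanged.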

Likewise the requirement of infrared safety, i.e. insensitivity to emission of 
low energy particles, is necessary in perturbative calculations to avoid the soft 
divergence and in experiments to avoid bias from the threshold trigger of a 
calorimeter cell and the background noise. 

Another requirement that has recently become topical in e+e− and DIS
physics is that the definition be fairly local in angle. This is experimentally
useful because of the transverse size of a hadronic shower in the calorimeter:
often the energy from a single hadron is spread over several calorimeter towers,
and it is clear that the jet definition should tend to put all this energy into the
same jet. However, with the JADE algorithm, which has been the standard
in those collisions for some time, this is not always the case. Also, if the
hadron happens to be near the edge of the jet cone, a cone algorithm will
neglect the energy falling outside it. On the other hand, in the k⊥ cluster
algorithm, all this energy will be clustered together before the final decision
is taken about whether to include it in the jet. This also results in
improved theoretical properties, as I will describe below.

Cone Algorithms 

Cone algorithms have been the standard way to define jets in pp collisions 
for many years. They are conceptually very simple to define, as a direction 
that maximizes the energy flowing through a cone drawn around it. However, 
the complications start when you consider what happens when two of these 
cones overlap. 

Although the 'Snowmass Accord' (^) agreed in principle the details of the 
cone algorithm, so that theorists and experimenters could use the same def- 
inition, this very important issue was neglected and has plagued jet studies 
ever since. It has been found that the properties of the resulting jets depend 
strongly on the exact treatment of the overlap region. This is not really a 
problem, since we do expect that different jet definitions will give different 
jets. The big problem is that the way the properties change depends
on the number of particles in the jet, giving very big differences between the
jet properties at parton, hadron and detector level.

FIG. 2. Schematic diagram of a jet configuration in which the cone algorithm at
parton- and hadron-level give very different answers.

FIG. 3. The energy profile of a 100 GeV jet, taken from ([|.

One example of this effect was studied in detail by Steve Ellis and company
when comparing their parton-level predictions with CDF data. They found
that owing to the width of a hadronic jet, configurations that were called one
jet in their calculation could be called two jets at hadron level, as illustrated
in Fig. 2. They found that the hadron-level algorithm could be simulated at
parton level by introducing a parameter R_sep, in addition to the jet radius R,
such that two partons are merged into one jet only if the conditions

    $$ R_1 < R \quad \& \quad R_2 < R \quad \& \quad R_1 + R_2 < R_{sep} $$

are satisfied. The first two are the Snowmass Accord, while the third is the
additional requirement that the partons not be too close to the edge of the
cone. It was found that good agreement with CDF data (Q) could be obtained
by setting R_sep = 1.3R, as shown in Fig. 3. However, the danger in this is that
the value of R_sep is not given to us by the jet definition and could be viewed
as a phenomenological parameter that should be tuned to data. In this sense,
there is no reason to suppose that the value of 1.3R should also describe jets
in the far forward region, or jets produced in a completely different reaction,
like top quark decays.
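
The parton-level merging test with R_sep described above can be sketched as follows. This is a minimal illustration under stated assumptions: R_1 and R_2 are taken to be the distances in (η, φ) of each parton from the jet axis, the jet axis is taken to be the E_T-weighted centroid of the two partons, and the function names and the (E_T, η, φ) tuple layout are hypothetical choices for this example.

```python
import math


def delta_phi(phi_a, phi_b):
    """Azimuthal difference phi_a - phi_b wrapped into [-pi, pi)."""
    return (phi_a - phi_b + math.pi) % (2.0 * math.pi) - math.pi


def merge_into_one_jet(parton1, parton2, R, R_sep):
    """Return True if two partons (ET, eta, phi) form a single cone jet.

    Assumption: the candidate jet axis is the ET-weighted centroid of the pair.
    """
    et1, eta1, phi1 = parton1
    et2, eta2, phi2 = parton2
    et = et1 + et2
    eta_jet = (et1 * eta1 + et2 * eta2) / et
    phi_jet = phi1 + et2 * delta_phi(phi2, phi1) / et
    # Distances of each parton from the candidate jet axis.
    r1 = math.hypot(eta1 - eta_jet, delta_phi(phi1, phi_jet))
    r2 = math.hypot(eta2 - eta_jet, delta_phi(phi2, phi_jet))
    # First two conditions: both partons inside the cone of radius R.
    # Third condition: the extra separation requirement.
    return r1 < R and r2 < R and (r1 + r2) < R_sep
```

With R = 0.7, two equal-E_T partons separated by 1.0 in η each lie 0.5 from the axis, so they pass the first two conditions; they are merged for R_sep = 2R but not for R_sep = 1.3R.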

One very nice feature of cone algorithms is the apparent ease with which 
energy corrections can be made. This is because they are purely geometrical, 
so the amount of out-of-cone showering in the calorimeter for example, can 
be very easily calculated from the known detector response and the energy 
inside the cone. Another type of correction that is sometimes applied is for 
the amount of the jet's energy that is radiated outside the cone and the 
amount of energy from the underlying event that sneaks into the cone. When 
comparing with leading order calculations this may appear straightforward 
since they do not account for the energy spread of the jet. However, at 
next-to-leading order, the calculation already includes the fact that some of a 
jet's energy is radiated outside the cone, so this should not be corrected for. 
Furthermore, the underlying event correction is inevitably model-dependent, 
even when it is measured from data. This is because in some models the jet 
pedestal is correlated with the jet momentum and direction, while in others 
it is completely uncorrelated. Thus different experimental procedures such as 
subtracting the average seen in a cone in minimum bias events or the average 
seen in a random direction in jet events at the same jet transverse momentum 
will give results that are interpreted differently in different models. The effect 
of such assumptions on jet data is discussed in detail in (^). An experimental 
test was proposed in (||) that would shed a lot of light on these problems, since 
it claims to disentangle how correlated the jet activity and pedestal height are. 
k⊥ Clustering Algorithms

Clustering algorithms have been the main way of defining jets in e+e−
annihilation for many years. They work in a very different way from cone
algorithms: instead of globally finding the jet direction, they start by finding
pairs of particles that are 'nearby' in phase-space and merging them together
to form new pseudoparticles. This continues iteratively until the event consists
of a few well-separated pseudoparticles, which are the output jets. There
is, however, no unique definition of 'closeness' in phase-space and different
definitions define different algorithms. Traditionally invariant mass was used,
but this has the disadvantage that it is extremely non-local in angle: if two
particles have low enough energy, they will be merged together, regardless of
how far apart they are in angle ([)]). This can be solved by using as closeness
the momentum of the softer particle transverse to the axis of the harder ([h]),
as in the k⊥ jet algorithm (see footnote 2). This means that, roughly speaking,
the algorithm repeatedly merges the softest particle in the event with its
nearest neighbour in angle. It has the great theoretical advantage that it
allows the phase-space for multiple emission to be factorized in the same way
as the QCD matrix elements, allowing analytical parton shower techniques to be
used (fill), as I describe below.

However, in collisions with incoming hadrons, there are additional particles 
in the final state that are not associated with hard jets, namely those coming 
from the hadron remnant and underlying event. For many years this was 
seen as a barrier to using clustering algorithms in hadronic collisions, as their 
property of exhaustively assigning every final state particle to a jet is clearly 
unphysical there. This problem was considered in (12), where it was shown
that it could be overcome by introducing an additional particle into the event
by hand, parallel to the incoming beams. In the case of the k⊥ algorithm,
this extra particle can be considered as having infinite momentum. The
resulting jet cross-sections are guaranteed to satisfy the factorization theorem,
so absolute predictions using p.d.f.s measured in other processes can be made
for their values, unlike earlier attempts to solve this problem.

As far as the practical properties of the algorithm are concerned, it is
essential for jet algorithms for hadron-hadron collisions to be invariant under
longitudinal boosts along the beam direction. A set of longitudinally-invariant
k⊥-clustering algorithms for hadron-hadron collisions was proposed in (13).
Briefly, the algorithm proceeds as follows:

1. For every pair of particles, define a closeness

    $$ d_{ij} = \min(E_{Ti}, E_{Tj})^2 \, \Delta R^2, \qquad \Delta R^2 = \Delta\eta^2 + \Delta\phi^2. $$

Note that for small opening angle, ΔR ≪ 1, we have

    $$ \min(E_{Ti}, E_{Tj})^2 \, \Delta R^2 \approx \min(E_i, E_j)^2 \, \Delta\theta^2 \approx k_\perp^2. $$

2. For every particle, define a closeness to the beam particles,

    $$ d_{ib} = E_{Ti}^2 R^2, $$

where R is an adjustable parameter of the algorithm introduced in (14).

3. If min{d_ij} < min{d_ib}, merge particles i and j.

4. If min{d_ib} < min{d_ij}, jet i is complete.

2 The k⊥ jet algorithm for e+e− is sometimes called the 'Durham algorithm'.

These steps are iterated until a given stopping condition is satisfied, as I
discuss in more detail in a moment. Different ways of merging two four-vectors
into one define different schemes. The two most common are the
"E"-scheme (simple four-vector addition) and the "p_t"-scheme:

    $$ E_{Tij} = E_{Ti} + E_{Tj}, $$
    $$ \eta_{ij} = (E_{Ti}\,\eta_i + E_{Tj}\,\eta_j) / E_{Tij}, $$
    $$ \phi_{ij} = (E_{Ti}\,\phi_i + E_{Tj}\,\phi_j) / E_{Tij}, $$

which become equivalent for small opening angle. Although there are some
small practical differences, they are not important here.

Depending on what kind of studies one is interested in, different stopping
conditions are useful. For inclusive jet studies, one iterates the above steps
until all jets are complete (14). In this case, all opening angles within each
jet are < R and all opening angles between jets are > R. This means that
the resulting jets are very similar to those produced by the cone algorithm,
with R ~ R_sep. As shown in Fig. 4, this is certainly true of the inclusive
jet cross-section, for which the two algorithms are almost indistinguishable at
next-to-leading order. It is worth noting that for a relatively soft jet, d_ij is its
squared transverse momentum relative to the nearest hard jet, while d_ib is R^2
times its squared transverse momentum relative to the incoming beams. If d_ib is
the smaller then the jet is treated as initial-state radiation, giving a resolvable
jet, while if it is the larger it is treated as final-state radiation, being merged
into the other jet. Thus the value R = 1 is strongly preferred theoretically, as it
treats initial- and final-state radiation on equal footings and would be expected
to give smallest higher-order corrections, at least from the dominant logarithmic
regions of phase-space. In this sense, the empirical relationship between R
and R_cone can be used as an a posteriori justification for using R_cone = 0.7.
The first experimental results using this algorithm were reported in (15).
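
Steps 1 to 4, the p_t-scheme merging and the inclusive stopping condition can be put together in a short sketch. It is a naive illustration, not the implementation used in any of the cited work: the function names and the (E_T, η, φ) tuple layout are assumptions, no attempt is made at efficiency, and the merged φ is not wrapped back into a standard range.

```python
import math


def delta_r2(a, b):
    """Squared distance in (eta, phi) between two objects (ET, eta, phi)."""
    deta = a[1] - b[1]
    dphi = (a[2] - b[2] + math.pi) % (2.0 * math.pi) - math.pi
    return deta * deta + dphi * dphi


def merge_pt_scheme(a, b):
    """Merge two objects using ET-weighted averages of eta and phi."""
    et = a[0] + b[0]
    eta = (a[0] * a[1] + b[0] * b[1]) / et
    dphi = (b[2] - a[2] + math.pi) % (2.0 * math.pi) - math.pi
    phi = a[2] + b[0] * dphi / et  # weighted average, safe across the phi wrap
    return (et, eta, phi)


def inclusive_kt_jets(particles, R=1.0):
    """Iterate steps 1-4 until every object has been declared a complete jet."""
    objs = list(particles)
    jets = []
    while objs:
        # Step 2: smallest beam closeness d_ib = ET_i^2 * R^2.
        i_beam = min(range(len(objs)), key=lambda i: objs[i][0])
        d_beam = objs[i_beam][0] ** 2 * R ** 2
        # Step 1: smallest pair closeness d_ij = min(ET_i, ET_j)^2 * dR^2.
        best = None
        for i in range(len(objs)):
            for j in range(i + 1, len(objs)):
                d = min(objs[i][0], objs[j][0]) ** 2 * delta_r2(objs[i], objs[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best is not None and best[0] < d_beam:
            # Step 3: merge the closest pair into one pseudoparticle.
            _, i, j = best
            merged = merge_pt_scheme(objs[i], objs[j])
            objs = [o for k, o in enumerate(objs) if k not in (i, j)]
            objs.append(merged)
        else:
            # Step 4: the object closest to the beam is a complete jet.
            jets.append(objs.pop(i_beam))
    return sorted(jets, key=lambda jet: -jet[0])
```

For example, an event with a 100 GeV and a 20 GeV particle close together and a 50 GeV and a 5 GeV particle close together elsewhere yields two jets, with E_T of 120 and 55 GeV.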

In many cases, one instead wants to reconstruct exclusive final states, for
example in top quark decay. In this case, one iterates the above steps until
all jet pairs have d_ij > d_cut, an adjustable parameter of the algorithm. All
complete jets with d_ib < d_cut are discarded (merged with the beam remnants).
Thus d_cut acts as a global cutoff on the resolvability of emission, initial- or
final-state. In this case setting R = 1 is even more strongly recommended,
as in the original algorithm of (13). One can either fix d_cut a priori, only
resolving radiation above the given k⊥ cutoff, or adjust it event-by-event to
reconstruct a given number of final-state jets.

Although the cluster jets are very similar to the cone jets in terms of the 
inclusive cross-section, they have several practical advantages. Firstly, the jet 
overlap problem has completely disappeared — the algorithm unambiguously 
assigns every particle to a single jet. It does this in a dynamic way, adjusting 
to the shapes of individual jets, and so performs better than any fixed strat- 
egy such as drawing a dividing line half way between the centres or giving all 
the energy to the higher-energy of the two. One can of course always come 
up with pathological jet configurations where any given algorithm does not 
cluster in the way that seems natural, but the relative contribution of such 
configurations is much smaller for the cluster algorithm than the cone. This 
means that exactly the same algorithm can be used on hadronic final states 
with many particles as on partonic final states with one or two, without the 
need for additional adjustable parameters. Secondly, the cluster algorithm 
is much less sensitive to perturbations from soft particles than the cone al- 
gorithm, which results in smaller hadronization and detector corrections, as 
well as a reduction in the model-dependence of these corrections. This is be- 
cause in some sense it pays the most attention to the core of the jet and only 
merges other neighbouring particles if they are near enough to do so, whereas 
the cone algorithm, which seeks to maximize the jet energy, does its best to 
pull in as much neighbouring energy as possible, as illustrated in Fig. 5. This
feature means that the k⊥ cluster algorithm is particularly suited for
kinematic reconstruction of particle decays.

FIG. 4. The inclusive jet cross-section for central jets with E_T = 100 GeV, as a
function of R for the cluster algorithm and 1.35 R_cone for the cone algorithm, for
three different values of the renormalization and factorization scales, μ. The barely
discernible pairs of curves are from the cone and cluster algorithms. Taken from (14),
where more precise details of the calculation can be found.

A detailed Monte Carlo comparison was made in (16) for the cases of top quark
reconstruction at Tevatron energy and Higgs boson at LHC energy. The results
for the former are shown in Fig. 6 for the mass of the three-jet combination that
minimizes
    $$ \chi^2 = \left( \frac{m_{jj} - \langle m_{jj} \rangle}{5\,\mathrm{GeV}} \right)^2 + \left( \frac{m_{jjj} - m_{l\nu j}}{10\,\mathrm{GeV}} \right)^2, $$

after the cut χ² < 3. The other cuts are described in (fHf). The cluster
algorithm's mass distribution is centred on the true value and is much more
symmetric than the cone algorithm's. Also the efficiency of the cluster al- 
gorithm is considerably higher, owing to the greater cleanliness of the event 
reconstruction, so that even though the widths of the two distributions are 
similar, the cluster algorithm would give a smaller error on the reconstructed 
mass. Another reason for this greater cleanliness is that the cluster algorithm 
is somewhat like a cone algorithm with its radius adjusted event-by-event to 
suit the individual event dynamics — when there are two hard jets near each 
other the effective radius is small to resolve them well, when they are all far 
apart it is large, allowing a good measurement of their energies. 



Internal Jet Structure 

It is clearly important to study the internal structure of jets, both as a test of
QCD and to understand the efficiencies, corrections, etc, of jet reconstruction
for other studies. In the cone algorithm, the natural way to do this is by
studying the spread of energy over various annular radii, as shown in Figs. 3
and 5. In the cluster algorithm, one has the possibility to study the internal
structure in a way that is much more like how we believe jets develop:
their structure comes not from a general smearing in angle, but by radiating
individual partons that give rise to smaller 'sub-jets' within the jet. Thus
by resolving these sub-jets we can compare their distributions with those
predicted by QCD or by Monte Carlo models.

FIG. 5. The energy profiles of 100 GeV cone and cluster jets, taken from (14). The
cluster jets (labeled "Comb") have more energy concentrated in the centremost bin,
while the cone jets are more diffuse. At the jet edge, r = R = R_cone = 1, the trend
is reversed: the cluster algorithm ignores energy more than R away so gives a flat
background contribution, while the cone algorithm has tried to pull this energy into
the jet, leaving a deficit outside.

Sub-jets are defined by first running the inclusive version of the algorithm
to find a jet. Then the algorithm is rerun starting from only the particles that
were part of that jet, stopping when all pairs have

    $$ d_{ij} > y_{cut} E_T^2, $$

where E_T is the transverse momentum of the jet and y_cut is an adjustable
resolution parameter. For y_cut ~ 1 the jet always consists of only 1 sub-jet
and for y_cut → 0 every hadron is considered a separate sub-jet, so adjusting
y_cut allows us to go smoothly into the hadronization region in a well-defined
way and to study the parton → many partons → hadrons transition in great
detail. Further details and the first experimental results can be found in (17).
Similar studies have also been performed in e+e− annihilation (18,19) and
will allow direct comparisons of the two sources of jets.
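
The sub-jet definition above can be sketched as a small counting routine. This is an illustrative sketch under stated assumptions: the stopping condition is taken to be d_ij > y_cut E_T² for all pairs, the jet E_T is taken as the scalar sum of its constituents' E_T, beam distances are ignored in the rerun, and the function names and the (E_T, η, φ) tuple layout are hypothetical.

```python
import math


def pair_closeness(a, b):
    """d_ij = min(ET_i, ET_j)^2 * dR^2 for two objects (ET, eta, phi)."""
    deta = a[1] - b[1]
    dphi = (a[2] - b[2] + math.pi) % (2.0 * math.pi) - math.pi
    return min(a[0], b[0]) ** 2 * (deta * deta + dphi * dphi)


def merge(a, b):
    """ET-weighted merging of two objects (the pt-scheme)."""
    et = a[0] + b[0]
    eta = (a[0] * a[1] + b[0] * b[1]) / et
    dphi = (b[2] - a[2] + math.pi) % (2.0 * math.pi) - math.pi
    return (et, eta, a[2] + b[0] * dphi / et)


def count_subjets(constituents, y_cut):
    """Number of sub-jets of one jet resolved at the scale y_cut."""
    objs = list(constituents)
    et_jet = sum(o[0] for o in objs)  # assumption: scalar-sum jet ET
    d_cut = y_cut * et_jet ** 2
    while len(objs) > 1:
        # Find the closest pair among the remaining pseudoparticles.
        d, i, j = min(
            (pair_closeness(objs[i], objs[j]), i, j)
            for i in range(len(objs))
            for j in range(i + 1, len(objs))
        )
        if d > d_cut:
            break  # all pairs are now resolved at this y_cut
        merged = merge(objs[i], objs[j])
        objs = [o for k, o in enumerate(objs) if k not in (i, j)]
        objs.append(merged)
    return len(objs)
```

Scanning y_cut downwards from 1 on a fixed set of constituents then steps the count up from one sub-jet towards one sub-jet per particle.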

Of course, sub-jets can also be defined in the cone algorithm, by rerunning
using a smaller jet radius. Results of a Monte Carlo comparison are shown
in Fig. 7, where it can be seen that the cone algorithm has a much larger
hadronization correction. The reasons for this are discussed in (EQ) (see footnote 3).



3 Note that this figure supersedes the one in (|l^) because that used a cone algorithm 
that was not infrared safe, so it is not surprising that it performed badly. On the 
contrary, the one shown here uses the cone algorithm recommended by the authors 
of (m. The event definition is also somewhat different to the one described above, 
but the conclusions would not be expected to be affected by this. 
FIG. 6. Reconstructed three-jet mass distribution of top quark candidates,
according to the cluster (solid) and cone (dashed) algorithms, at calorimeter level, for a
top quark of nominal mass 150 GeV. Taken from (fig).

FIG. 7. The fraction of two-jet events that contain 2, 3, 4, or 5 sub-jets when
resolved at a scale y_cut for the cluster algorithm and a radius r for the cone algorithm,
at the parton level (solid) and calorimeter level (dotted). Taken from (EOh .



CALCULATING JET PROPERTIES 

To study QCD, and to understand the corrections and efficiencies of jet 
reconstruction, we are particularly interested in the internal properties of 
jets. We have already seen in Walter's talk (||) that external jet properties 
such as the transverse momentum spectra, rapidity distributions, dijet mass 
spectra, etc, are well-described by next-to-leading order QCD calculations. 
The internal properties that I will be concerned with are things like the energy 
profile of a jet, jet mass distributions and internal sub-jet structure, which are 
infrared-finite and can be calculated in perturbation theory. However, owing 
to the one-to-one correspondence between partons and jets in leading order 
calculations, the leading non-trivial term is the one given by "next-to-leading"
order jet calculations. This means that they suffer all the usual problems with
leading order calculations, like large sensitivity to the renormalization scale. 

For many of these jet properties, one encounters large logarithmic terms
at all orders in perturbation theory and it is essential that these terms are
reorganized ("resummed") into an improved perturbation series. Examples
include the energy profile at small angles and the sub-jet structure at small
y_cut. This is conventionally done by the parton shower approach, either as a
Monte Carlo simulation, or analytically.

Parton Showers 

The cross-section for multiple emission factorizes in the collinear limit. This
means that the production of additional partons close to a jet direction can
be described in a time-ordered probabilistic way as a series of 1 → 2 splittings
one after another. In the strongly-ordered limit, in which each splitting is
much more collinear than the last, this is guaranteed to reproduce the exact
multi-parton matrix element. This strongly-ordered region of phase-space
contributes the leading logarithmic contribution to energy-weighted quantities
like the energy profile. Thus any parton shower based on sequential collinear
emission can predict these properties to leading logarithmic accuracy. It is
worth noting that in the strongly-ordered limit, all measures of collinearity,
like transverse momentum, virtuality and angle, are completely equivalent:
algorithms that use different choices differ only by next-to-leading logarithms.

FIG. 8. Soft large-angle gluons cannot resolve the individual colour charges in the
jet, so the coherent sum of emission off all the external lines is equivalent to emission
by the original parton, imagined to be on-shell, i.e. before the other emission.

However, many jet properties are sensitive to soft gluon emission, such as
the sub-jet distributions. In this case, it seems at first sight like no
probabilistic algorithm could possibly describe the full matrix element. This is because
the matrix-element amplitude for any given final-state configuration is the
sum of terms in which the gluon is emitted by each external leg, as illustrated
in Fig. 8. Unlike the collinear case, all of these contribute to the leading
logarithm and when taking the square of the sum the interference terms can
be both positive and negative. Therefore it looks like we have to abandon
the idea of describing the production of soft gluons in terms of independent
probabilistic 1 → 2 splittings.

However, the fact that radiation from all the different emitters is coherent,
a simple consequence of gauge invariance, comes to the rescue. It means that,
after averaging over the relative azimuths of different emissions (see footnote 4),
the emission of any given soft gluon can be attributed to a single coloured line
somewhere in the diagram, either external, or internal but imagined to be
on-shell (23). The fact that the emission probability from an internal line is
what it would be if it was on-shell results in the famous angular ordering
condition (24), namely that large angle emission should be treated as occurring
earlier than smaller angle emission. This is the basis of coherent parton shower
algorithms.
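
The idea of generating emissions as a probabilistic sequence ordered in angle can be illustrated with a deliberately crude toy. It is not a coherent parton shower: it assumes a constant emission density a dθ/θ from a single emitter (the parameter `a` is a hypothetical stand-in for the coupling and colour factor), and it ignores energy fractions, recoil, running coupling and secondary branching. Each step solves "no-emission probability = uniform random number" for the next, smaller angle.

```python
import math
import random


def angular_ordered_emissions(theta_max, theta_min, a, rng):
    """Generate emission angles in strictly decreasing order (toy model).

    With density a * dtheta / theta, the probability of no emission between
    theta and a smaller angle theta_next is (theta_next / theta) ** a.
    Setting this equal to a uniform random number r gives theta_next.
    """
    angles = []
    theta = theta_max
    while True:
        r = 1.0 - rng.random()  # uniform in (0, 1], avoids r == 0
        theta = theta * r ** (1.0 / a)
        if theta < theta_min:
            return angles  # below the resolution cutoff: stop the sequence
        angles.append(theta)


# Illustrative use with a fixed seed; the numbers are arbitrary.
rng = random.Random(12345)
example = angular_ordered_emissions(1.0, 0.01, 0.5, rng)
```

In this toy the number of emissions is Poisson distributed with mean a ln(θ_max/θ_min), which is the kind of logarithmic enhancement that the resummation discussed above organizes.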

It is important to note that this is not the same as using a standard
virtuality-ordered algorithm and disallowing disordered emission, although
claims are often made to the contrary. For example, in the configuration
of Fig. 8 the total gluon emission probability is non-zero and proportional
to C_F, the colour charge of a quark, whereas the veto algorithm would simply
disallow all soft large-angle emission from this system.

Many of the most important next-to-leading logarithmic contributions can
be incorporated by simple modifications to the leading logarithmic
coherence-improved algorithm. A well-known example is the argument of the running
coupling: large sub-leading corrections can be summed to all orders simply
by using transverse momentum.

4 In fact azimuth-dependent terms that average to zero can also be included in
the parton shower framework and are automatically incorporated in the colour
dipole cascade model.

Emission from initial-state coloured partons can also be treated in the same
way. Collinear emission is responsible for the scaling violations in the parton
distribution functions. This means that the input set of p.d.f.s can be used
to guide the collinear emission and ensure that the distributions produced are
the same as those used in theoretical calculations. This allows a 'backwards'
evolution algorithm to be used (25), with evolution starting at the hard
interaction and working downwards in scale back towards the incoming hadron.
Partons produced by initial-state radiation subsequently undergo final-state
showering. The coherence of radiation from different sources can again be
incorporated by choice of the evolution scale, at least for not too small
momentum fractions (see footnote 5). In this case the appropriate variable is the
product of the opening angle and the emitter's energy (27).

So far I have described how the parton shower evolves, but it also needs a
starting condition to evolve from. This is particularly important for interjet
properties, as well as for setting the whole scale of the subsequent
evolution. It is provided by the lowest order matrix element for the process under
consideration: jet production, prompt photon, top quark production and
decay, or whatever. As shown in (pq), the coherence of radiation from the
various emitters plays an important role here too. Each hard process can
be broken down into a number of 'colour flow' diagrams, which control the
pattern of soft radiation in that process. To leading order in the number of
colours, N_c, gluons can be considered as colour-anticolour pairs and a unique
'colour partner' can be assigned to each parton for each colour flow. After
azimuthal averaging, radiation from each parton is limited to a cone bounded
by its colour connected partner, as illustrated for a particularly simple case
in Fig. 9. It is important to note that although the radiation inside the cone
around each parton is modeled as coming from that parton, it is the coherent
sum of radiation from all emitters in the event, even the internal lines, a point
that I shall return to later.

Violations of this picture come from two main sources: colour-suppressed
terms and semi-soft terms. 1/N_c is not such a small expansion parameter
so one might expect non-leading colour terms to be a very important
correction. However, they tend to be also dynamically suppressed, since they are
non-planar, and neglecting them is generally a good approximation. One can
however find special corners of phase-space in which no parton shower
algorithm based on the large-N_c limit could be expected to be reliable (BEN). The
colour-suppressed terms are generally negative and have not successfully been
incorporated into a probabilistic Monte Carlo picture. The semi-soft
corrections arise simply from the fact that the emission cones are derived in the
limit of extremely soft emission that does not disturb the kinematics of the
event at all, while harder emission makes the emitters recoil, disturbing their
radiation patterns. Both of these effects mean that the initial conditions used
in most implementations of coherent parton showers are actually too strong,
since they prevent emission in regions where it is suppressed but not absent
in the full calculation (i.e. outside the cones).

5 Even at asymptotically small x, it is possible to construct a probabilistic parton
shower algorithm (bq), but at present this is only of the forward evolution type,
making it vastly inefficient for most studies. However, the authors of (26) found
very little discrepancy with their coherent parton shower algorithm (Bjjh even at
the very small x values encountered in DIS at a LEP+LHC energy.




FIG. 9. Feynman (left), colour flow (centre) and centre-of-mass frame (right)
diagrams of qq → q'q' showing the radiation cones of each parton. Most
processes have many different colour flows.

Monte Carlo Parton Showers



Monte Carlo shower algorithms implement the probabilistic interpretation 
in a very literal way, using a random number generator. They give fully 
exclusive distributions of all final-state properties with unit weight, meaning 
that any given configuration occurs with the same frequency as in nature. The 
relevant jet properties can be directly measured from the produced particles, 
exactly as in the experimental procedure. 
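
To make the "very literal" use of a random number generator concrete, here is
a minimal Python sketch of a toy shower. It assumes a constant emission rate
dP = a dq/q in the evolution scale q, so that the probability of no emission
between scales q_i and q is (q/q_i)^a; the real algorithms discussed below use
running couplings, splitting functions and full kinematics. All names and
parameters (toy_shower, q_start, q_cut, a) are hypothetical.

```python
import math
import random

def toy_shower(q_start, q_cut, a, rng):
    """Return the ordered emission scales of one toy shower.

    Assumes a constant emission rate dP = a * dq / q, so the probability
    of no emission between scales q_i and q is (q / q_i) ** a.
    """
    scales = []
    q = q_start
    while True:
        # Solve (q_next / q) ** a = r for a uniform random number r in (0, 1].
        r = 1.0 - rng.random()
        q = q * r ** (1.0 / a)
        if q < q_cut:
            return scales
        scales.append(q)

def mean_multiplicity(n_events, q_start, q_cut, a, seed=1):
    """Average number of emissions above q_cut over n_events showers."""
    rng = random.Random(seed)
    total = sum(len(toy_shower(q_start, q_cut, a, rng)) for _ in range(n_events))
    return total / n_events
```

Every generated event carries unit weight, so distributions can be histogrammed
directly. In this toy model the mean number of emissions should approach
a ln(q_start/q_cut) as the number of events grows.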

The available models were thoroughly reviewed in (30), so I only make a
few brief comments about each. 



ISAJET (31) implements a virtuality-ordered collinear parton shower
algorithm for both initial- and final-state evolution. It does not include any
account of coherence. It is specific to hadron-hadron collisions, so its
parameters can be freely tuned to improve the fit to data.

PYTHIA (32) also implements a virtuality-ordered collinear parton shower
algorithm. In the case of final-state radiation, this is coherence-improved
by disallowing emission with disordered opening angles. Coherence is now
incorporated into the initial conditions for the initial-state radiation (this is
called 'PYTHIA+' below and is now the default), but not within the evolution
itself. The parameters are those tuned to e+e- annihilation so the model is
already well-constrained and predictive for hadron-hadron collisions.



HERWIG (33) implements a complete coherence-improved parton shower
algorithm for both initial- and final-state evolution, incorporating azimuthal
correlations both within and between jets. The implementation is sufficiently
precise that in limited regions of phase-space it is reliable to next-to-leading
logarithmic accuracy, and HERWIG's Λ parameter can be related to Λ_MS-bar
(34). The parameters are again tuned to e+e- and DIS data, making the
model highly predictive.

ARIADNE (|3^) is a new Monte Carlo event generator that has only recently
become available for hadron-hadron collisions, although it has been very
successful in describing e+e- and DIS data. It fully implements colour
coherence, but in a completely different way to that described above, being
based on the colour dipole cascade model (|2^). It makes no separation into
initial- and final-state radiation, instead modeling the emission from the
whole system in a coherent way. Its parameters are also tuned to e+e- and DIS
data.

The issue of coherence has become paramount in describing e+e- data,
and models that do not implement it, at least approximately, are completely
ruled out (see for example (36) and references therein). They are also strongly
disfavoured in DIS. Data from the collider are now sufficiently precise that
coherence effects can also be studied there. Fig. 10 shows the results of a
recent CDF study ( p7| ) of the distribution of the softest jet in three-jet
events. Only the models incorporating coherence, HERWIG and the updated PYTHIA,
can account for the data. It is possible that a similar interpretation could be
made of the sub-jet data presented in (17), although further study is needed
to confirm this.

FIG. 10. The rapidity distribution of the third-hardest jet in multi-jet events,
taken from (^).

Hard Parton Shower Emission

By construction, coherence-improved shower algorithms reproduce the ex- 
act matrix element in the soft and collinear limits. Soft here means relative 
to the hardest scale in the event — a jet of 50 GeV or so could be consid- 
ered soft in the context of top quark production for example. However, jet 
cuts always pull us away from the soft and collinear regions since they require 
the emission to be resolvable. For example, the cross-section for 'Mercedes' 
events, with three jets of equal transverse momentum evenly spaced in az- 
imuth, is not completely negligible and since this is so far from any singular 
regions, we should certainly worry how well parton shower algorithms will
reproduce it. There are two separate worries: whether there are phase-space
regions that the algorithm does not populate and whether it is a good enough
approximation in the regions it does populate.

The accuracy of fixed-order matrix elements is complementary to that of 
parton showers — they are exact for hard well-separated jets (to leading order 
in α_s), but do not correctly account for multiple emission effects, so become
increasingly unreliable if the cutoffs are made very small. It is natural to hope 
that the two approaches can be combined to yield a single algorithm that is 
uniformly reliable for hard and soft emission, collinear and well-separated. 
Several phenomenological attempts have been made in the past, but these
generally suffer from a variety of problems. Firstly, it is essential that if the
phase-space region is divided into two parts then they are smoothly matched at the
boundary to prevent double-counting or under-counting. Ideally the bound- 
ary itself should be an arbitrary parameter so that one can explicitly check 
that varying it does not affect the output distributions. It is worth noting 
that for this to be true, it is incorrect to use the exact leading order matrix 
element, as a form factor must be introduced even within the hard region. 
Many algorithms work by correcting the first emission to reproduce the hard 
matrix element. However, the choice of ordering variable is scheme-dependent 
and so therefore is the definition of which emission is first. This problem is 
particularly severe in angular-ordered parton shower algorithms, where there 
are often several soft large-angle emissions before the hardest emission in the 
jet. If one nevertheless just corrects the first emission, one obtains a hard 
limit that is dependent on the soft infrared cutoff of the algorithm, which is 
clearly unphysical. 

These problems were discussed in (|3^), where self-consistent algorithms 
were proposed for correcting both deficiencies — filling empty regions of phase- 
space and correcting the distributions of all hard emission (not just the first) 
within the parton shower region. There is no conceptual barrier to applying 
them to arbitrary hard processes, such as top quark production and decay, 
and it would be straightforward to do so in an algorithm specifically designed 
with this in mind. However, it has proved rather tricky to weave them into 
the existing HERWIG algorithm and this has so far only been done for the 
simplest processes, e+e- → qq̄ and DIS, eq → eq. Results for the latter are
shown in Fig. 11 (see footnote 6). The hard correction, i.e. the filling of
empty phase-space regions, is an important correction, while the soft
correction, within the parton-shower phase-space, makes very little difference.
The same is also true in e+e-. This can be interpreted as meaning that although
HERWIG's parton shower algorithm does not fill the whole of phase-space owing
to its over-strict implementation of the angular-ordering initial condition, it
is a good approximation in the regions it does fill.

6 In fact almost perfect agreement with the data is obtained if the default
parameters tuned to e+e- annihilation are used. This figure instead uses H1's
tuned set for comparison with their paper.

FIG. 11. The transverse momentum flow in the γ*p centre-of-mass frame of DIS
events at small x, taken from (39), data from (^). The first three curves are without
detector acceptance and the dot-dash is the third after including the acceptance.

It seems plausible that these corrections would fix the problems for top 
quark production and decay events reported in (^l|) and in Stephen Parke's 
talk (^), but we have not yet been able to explicitly check this. 



Azimuthal Decorrelation 



In the last couple of years there have been several proposed tests of 'non- 
DGLAP' evolution, i.e. quantities for which the collinear evolution described 
above is insufficient, as discussed in the talk by Vittorio del Duca (43).
Although some of these are certainly sensitive to new small-x dynamics, many
can be reliably predicted using coherence-improved evolution.

One such example is the decorrelation in azimuth of jet pairs as the rapidity
interval between them increases. In the BFKL framework, this is attributed
to emission from the t-channel exchanged gluon, as illustrated in Fig. 12 for
a simple example, qq' → qq', chosen because it has a single colour flow. The
full emission pattern of this system consists of emission proportional to C_F
inside small cones of opening angle θ_s, the scattering angle, and proportional
to C_A in the remainder of the solid angle. However, a coherence-improved
parton shower would describe this situation as two colour lines, each of which
has been scattered through almost 180°. Thus each quark emits proportional
to C_F into a 'cone' of opening angle 180° - θ_s and, apart from the difference
between C_A and 2C_F, a 1/N_c^2 correction, it reproduces the full emission
pattern. Thus to leading order in the number of colours, coherence-improved
parton showers include soft emission from this internal line and should
reproduce the full QCD result for azimuthal decorrelations.
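
The size of the mismatch between C_A and 2C_F can be checked with a few lines
of exact arithmetic. The sketch below uses the standard SU(N_c) colour factors
C_F = (N_c^2 - 1)/(2 N_c) and C_A = N_c; the function names are chosen for this
illustration only.

```python
from fractions import Fraction

def colour_factors(n_c):
    """Return (C_F, C_A) for SU(n_c) as exact fractions."""
    c_f = Fraction(n_c * n_c - 1, 2 * n_c)
    c_a = Fraction(n_c)
    return c_f, c_a

def relative_mismatch(n_c):
    """Fraction of the C_A emission that two C_F colour lines fail to reproduce."""
    c_f, c_a = colour_factors(n_c)
    return (c_a - 2 * c_f) / c_a
```

For N_c = 3 this gives C_F = 4/3, C_A = 3 and a relative mismatch of 1/9, that
is 1/N_c^2, so the leading-N_c picture slightly underestimates the emission in
the region between the cones.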




FIG. 12. BFKL ladder (left), full emission pattern (centre) and leading-N_c
emission pattern (right) for small angle qq' scattering, contributing to the
azimuthal decorrelation of the resulting jets. In the full case, emission from
each quark is confined to its small forward cone and there is emission from the
internal gluon, while in the leading-N_c case each quark emits everywhere
except its small backward cone and there is no other emission.

Analytical Parton Showers



The probabilistic parton shower framework described above can also be used 
for analytical calculations. In this case, one sets up a probabilistic evolution 
equation for how the quantity of interest, for example the distribution of
sub-jets, changes as a function of either the jet p_t or resolution scale. This
gives integro-differential equations that, with appropriate boundary
conditions, can be solved analytically. However, this can only be done for
experimental observables in which the phase-space for multiple resolved
emission factorizes. In the case of sub-jets this essentially requires that the
resolution variable be of k⊥ type. So far, only one calculation of this type
has been made for hadron-hadron collisions (although there are several for e+e-
and DIS), for the average number of sub-jets resolved in an inclusive jet (44).
The result is shown in Fig. 13 and is compared to D0 data in (0). The
increasing importance of the all-orders resummation of next-to-leading
logarithmic terms at small y_cut can be clearly seen.



Hadronization 

The process by which coloured partons are confined into hadrons is not
understood at a fundamental level at present, so phenomenological models
must be used. The dynamics of the parton shower are preconfining, meaning
that partons tend to end up close, in both phase-space and real-space,
to their colour-connected partners. This suggests that hadronization is a
fairly local process, taking place in the spatial regions between (but near to)
the final-state partons. The string and cluster models used in PYTHIA and
HERWIG respectively both implement this idea and both give good fits to
e+e- data. Like their parton shower algorithms, the model parameters are
strongly constrained by e+e- data, making them highly predictive for
hadron-hadron collisions. The independent fragmentation model, in which single
partons decay to hadrons according to longitudinal phase-space with no account
of colour connections, is already ruled out by e+e- data.

FIG. 13. The average number of sub-jets in a central 100 GeV jet, taken from ([lij).
Curves: leading order (α_s); leading order + NLLA resummation.

Owing to lack of time in the talk and space in the proceedings, I can do no 
more than mention a new area that is sure to grow in the near future, the
analytical calculation of hadronization corrections. Several different
approaches appear to be converging on the result that perturbation theory,
suitably modified, is more powerful than we thought and may even be capable of
predicting hadron-level cross-sections. The introduction of an effective
running coupling that is integrable at low momenta ([l5]) alone may be
sufficient to calculate the dominant corrections to infrared-finite quantities,
although the factorization this implies (46) has been questioned in
renormalon-based approaches. More details can also be found in George
Sterman's talk.



CONCLUSION 

As I have stressed throughout the talk, jets are not fundamental objects in 
QCD, but are artificial event properties defined by hand. Different definitions 
have good and bad points and the results will be a function of these defini- 
tions. We need reliable methods to calculate these event properties from the 
fundamental objects of perturbative QCD, quarks and gluons. 

Cone-type jet algorithms are simple in principle, but turn out to be com- 
plicated in practice, owing to jet overlap problems. Cluster-type algorithms 
are more complicated to define, but turn out to be simple in practice, since 
exactly the same algorithm can be used for one or two partons as on the 
hadron-level or detector-level final state. The k⊥ cluster algorithm has a
variety of practical and theoretical advantages both for jet physics and for event
reconstruction. 

For jet properties, very few analytical calculations are available, particularly
in the important region of small resolution parameters, but this is expected 
to improve in the future. Modern coherence-improved Monte Carlo models 
are very sophisticated implementations of perturbative QCD plus very con- 
strained models of hadronization and are highly predictive for hadron-hadron 
collisions. However, there are areas in which they need to be improved, most 
notably in matching parton showers with exact matrix elements to improve 
the description of very hard emission. 

It is clear that many areas outside the standard QCD arena are becoming 
increasingly reliant on jet physics. For example the error on the top mass 
will soon become dominated by the treatment of gluon radiation and it will 
become absolutely essential to consider the problems discussed in this talk. 
Namely, how can the present jet definition and modeling be improved upon? 